Self-supervised learning (SSL) has become a desirable paradigm in computer vision, since supervised models fail to learn representations that generalize in domains with limited labels. The recent popularity of SSL has led to the development of several models that make use of diverse training strategies, architectures, and data augmentation policies, with no existing unified framework to study or assess their effectiveness in transfer learning. We propose a data-driven geometric strategy to analyze different SSL models using local neighborhoods in the feature space induced by each. Unlike existing approaches that consider mathematical approximations of the parameters, individual components, or the optimization landscape, our work aims to explore the geometric properties of the representation manifolds learned by SSL models. Our proposed manifold graph metrics (MGMs) provide insights into the geometric similarities and differences between available SSL models, their invariance with respect to specific augmentations, and their performance on transfer learning tasks. Our key findings are twofold: (i) contrary to popular belief, the geometry of SSL models is not tied to their training paradigm (contrastive, non-contrastive, or cluster-based); (ii) we can predict the transfer learning capability of a specific model based on the geometric properties of its semantic and augmentation manifolds.
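The core operation in this analysis is local: given embeddings of the same inputs from two SSL models, one can compare the neighborhood structures they induce. A minimal sketch of such a neighborhood-based comparison follows; the actual manifold graph metrics are defined in the paper, and the k-NN graph, Jaccard overlap score, and all names here are illustrative assumptions, not the authors' exact construction.

```python
# A minimal sketch of neighborhood-based comparison of two embedding spaces.
# The actual manifold graph metrics (MGMs) are defined in the paper; here we
# illustrate the general idea with a simple k-NN neighborhood-overlap score.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_indices(features: np.ndarray, k: int = 10) -> np.ndarray:
    """Return the k nearest-neighbor indices for each sample (self excluded)."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(features)
    _, idx = nn.kneighbors(features)
    return idx[:, 1:]  # drop the self-neighbor in column 0

def neighborhood_overlap(feat_a: np.ndarray, feat_b: np.ndarray, k: int = 10) -> float:
    """Average Jaccard overlap between local neighborhoods of two models'
    feature spaces, computed on the same set of input samples."""
    idx_a, idx_b = knn_indices(feat_a, k), knn_indices(feat_b, k)
    scores = []
    for a, b in zip(idx_a, idx_b):
        sa, sb = set(a), set(b)
        scores.append(len(sa & sb) / len(sa | sb))
    return float(np.mean(scores))

# Usage: feat_a, feat_b are (n_samples, dim) embeddings of the same images
# extracted from two different SSL models.
```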
This paper explores three novel approaches to improve the performance of speaker verification (SV) systems based on deep neural networks (DNNs) using multi-head self-attention (MSA) mechanisms and memory layers. First, we propose the use of a learnable vector, called the class token, to replace the global average pooling mechanism for extracting embeddings. Unlike global average pooling, our proposal takes into account the temporal structure of the input, which is relevant for the text-dependent SV task. The class token is concatenated to the input before the first MSA layer, and its state at the output is used to predict the classes. To gain additional robustness, we introduce two approaches. First, we develop a Bayesian estimation of the class token. Second, we add a distilled representation token, combined with the class token, for training a teacher-student pair of networks using the Knowledge Distillation (KD) philosophy. This distillation token is trained to mimic the predictions of the teacher network, while the class token replicates the true label. All strategies have been tested on the RSR2015-Part II and DeepMine-Part 1 databases for text-dependent SV, providing competitive results compared to the same architecture using average pooling to extract the embeddings.
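As a rough illustration of the first idea, the sketch below (PyTorch) prepends a learnable class token to the frame-level input of a transformer encoder and reads the token's output state for classification, instead of averaging over time. The layer sizes, encoder configuration, and classifier head are illustrative assumptions rather than the paper's exact architecture.

```python
# A minimal sketch of replacing average pooling with a learnable class token.
import torch
import torch.nn as nn

class ClassTokenEncoder(nn.Module):
    def __init__(self, dim: int = 256, n_heads: int = 4, n_layers: int = 2,
                 n_classes: int = 100):
        super().__init__()
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))  # learnable vector
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.classifier = nn.Linear(dim, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, dim) frame-level features
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1)       # prepend token before first MSA layer
        x = self.encoder(x)
        embedding = x[:, 0]                  # output state of the class token
        return self.classifier(embedding)    # predicts the class
```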
An increasing number of systems are being designed by gathering significant amounts of data and then optimizing the system parameters directly using the obtained data, often without analyzing the structure of the dataset. As task complexity, data size, and parameter counts grow to millions or even billions, data summarization is becoming a major challenge. In this work, we investigate data summarization via dictionary learning (DL), leveraging the properties of recently introduced non-negative kernel regression (NNK) graphs. Unlike previous DL techniques such as kSVD, our proposed NNK-Means learns a geometric dictionary whose atoms are representative of the input data space. Experiments show that summarization using NNK-Means provides better class separation than linear and kernel versions of kMeans and kSVD. Moreover, NNK-Means is scalable, with runtime complexity similar to that of kMeans.
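The following is a heavily simplified sketch of a kMeans-style dictionary-learning loop in the spirit of NNK-Means: it alternates a sparse non-negative coding step with a dictionary update. The exact NNK coding rule comes from the paper; here it is approximated by non-negative least squares over each point's nearest atoms, which, together with all names below, is an illustrative assumption.

```python
# A simplified kMeans-style dictionary-learning loop: alternate a sparse
# non-negative coding step (approximated here with NNLS over nearest atoms)
# with a weighted-mean dictionary update.
import numpy as np
from scipy.optimize import nnls

def nnk_means_sketch(X, n_atoms=16, n_neighbors=5, n_iters=10, seed=0):
    rng = np.random.default_rng(seed)
    D = X[rng.choice(len(X), n_atoms, replace=False)].copy()  # init atoms from data
    for _ in range(n_iters):
        W = np.zeros((len(X), n_atoms))
        for i, x in enumerate(X):
            # restrict coding to the nearest atoms (sparse support)
            support = np.argsort(np.linalg.norm(D - x, axis=1))[:n_neighbors]
            w, _ = nnls(D[support].T, x)      # non-negative coefficients
            W[i, support] = w
        # dictionary update: weighted mean of the points using each atom
        for j in range(n_atoms):
            if W[:, j].sum() > 0:
                D[j] = (W[:, j] @ X) / W[:, j].sum()
    return D, W
```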
Data-driven neighborhood definitions and graph constructions are often used in machine learning and signal processing applications. k-nearest neighbor (kNN) and $\epsilon$-neighborhood methods are among the most common methods used for neighborhood selection, due to their computational simplicity. However, the choice of parameters associated with these methods, such as k and $\epsilon$, is still ad hoc. We make two main contributions in this paper. First, we present an alternative view of neighborhood selection, where we show that neighborhood construction is equivalent to a sparse signal approximation problem. Second, we propose an algorithm, non-negative kernel regression (NNK), for obtaining neighborhoods that lead to better sparse representation. NNK draws similarities to the orthogonal matching pursuit approach to signal representation and possesses desirable geometric and theoretical properties. Experiments demonstrate (i) the robustness of the NNK algorithm for neighborhood and graph construction, (ii) its ability to adapt the number of neighbors to the data properties, and (iii) its superior performance in local neighborhood and graph-based machine learning tasks.
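To make the sparse-approximation view concrete, the sketch below selects an NNK neighborhood for a single query point: starting from a kNN candidate set, it solves the non-negative kernel regression objective and keeps only the candidates that receive nonzero weight. The Gaussian kernel, the candidate-set size, and the Cholesky-based reduction to standard NNLS are illustrative choices consistent with, but not copied from, the paper.

```python
# A minimal sketch of NNK neighborhood selection for one query point.
# Neighbors with zero weight in the non-negative sparse approximation
# are pruned automatically.
import numpy as np
from scipy.optimize import nnls
from scipy.linalg import cholesky, solve_triangular

def gaussian_kernel(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

def nnk_neighbors(X, query, k_init=10, sigma=1.0):
    """Start from k_init nearest candidates, then solve the non-negative
    kernel regression problem; nonzero weights define the NNK neighborhood."""
    dists = np.linalg.norm(X - query, axis=1)
    cand = np.argsort(dists)[:k_init]                 # kNN candidate set
    K = gaussian_kernel(X[cand], X[cand], sigma)      # kernel among candidates
    k_q = gaussian_kernel(X[cand], query[None], sigma).ravel()
    # min_{theta >= 0} 0.5 theta^T K theta - k_q^T theta, via Cholesky + NNLS
    L = cholesky(K + 1e-10 * np.eye(len(cand)), lower=True)
    theta, _ = nnls(L.T, solve_triangular(L, k_q, lower=True))
    keep = theta > 1e-8
    return cand[keep], theta[keep]                    # neighbor indices, weights
```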
Research in Graph Signal Processing (GSP) aims to develop tools for processing data defined on irregular graph domains. In this paper we first provide an overview of core ideas in GSP and their connection to conventional digital signal processing, along with a brief historical perspective to highlight how concepts recently developed in GSP build on top of prior research in other areas. We then summarize recent advances in developing basic GSP tools, including methods for sampling, filtering or graph learning. Next, we review progress in several application areas using GSP, including processing and analysis of sensor network data, biological data, and applications to image processing and machine learning.
In applications such as social, energy, transportation, sensor, and neuronal networks, high-dimensional data naturally reside on the vertices of weighted graphs. The emerging field of signal processing on graphs merges algebraic and spectral graph theoretic concepts with computational harmonic analysis to process such signals on graphs. In this tutorial overview, we outline the main challenges of the area, discuss different ways to define graph spectral domains, which are the analogues to the classical frequency domain, and highlight the importance of incorporating the irregular structures of graph data domains when processing signals on graphs. We then review methods to generalize fundamental operations such as filtering, translation, modulation, dilation, and downsampling to the graph setting, and survey the localized, multiscale transforms that have been proposed to efficiently extract information from high-dimensional data on graphs. We conclude with a brief discussion of open issues and possible extensions.
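As a concrete instance of the graph spectral domain discussed above, the short sketch below computes a graph Fourier transform from the eigendecomposition of the combinatorial Laplacian of a small example graph and applies a low-pass spectral filter; the 4-cycle graph and the filter response are illustrative assumptions.

```python
# Graph Fourier transform and spectral filtering on a toy graph: the
# Laplacian eigenvectors play the role of the Fourier basis, and filtering
# reweights a signal's spectral coefficients.
import numpy as np

W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)   # adjacency of a 4-cycle
L = np.diag(W.sum(1)) - W                   # combinatorial Laplacian L = D - W
lam, U = np.linalg.eigh(L)                  # graph frequencies and Fourier basis

x = np.array([1.0, -1.0, 2.0, 0.0])         # a signal on the 4 vertices
x_hat = U.T @ x                             # graph Fourier transform
h = 1.0 / (1.0 + 2.0 * lam)                 # an example low-pass spectral response
y = U @ (h * x_hat)                         # filtered signal back in vertex domain
```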
Process monitoring and control are essential in modern industries for ensuring high quality standards and optimizing production performance. These technologies have a long history of application in production and have had numerous positive impacts, but also hold great potential when integrated with Industry 4.0 and advanced machine learning, particularly deep learning, solutions. However, in order to implement these solutions in production and enable widespread adoption, the scalability and transferability of deep learning methods have become a focus of research. While transfer learning has proven successful in many cases, particularly with computer vision and homogeneous data inputs, it can be challenging to apply to heterogeneous data. Motivated by the need to transfer and standardize established processes to different, non-identical environments and by the challenge of adapting to heterogeneous data representations, this work introduces the Domain Adaptation Neural Network with Cyclic Supervision (DBACS) approach. DBACS addresses the issue of model generalization through domain adaptation, specifically for heterogeneous data, and enables the transfer and scalability of deep learning-based statistical control methods in a general manner. Additionally, the cyclic interactions between the different parts of the model enable DBACS to not only adapt to the domains, but also match them. To the best of our knowledge, DBACS is the first deep learning approach to combine adaptation and matching for heterogeneous data settings. For comparison, this work also includes subspace alignment and a multi-view learning approach that deals with heterogeneous representations by mapping data into correlated latent feature spaces. Finally, DBACS, with its ability to adapt and match, is applied to a virtual metrology use case for an etching process run on different machine types in semiconductor manufacturing.
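As a rough, heavily simplified illustration of the cyclic idea, the sketch below pairs two mapping networks that translate between heterogeneous source and target feature spaces and ties them together with a cycle-consistency loss. The network shapes, the L1 reconstruction loss, and the omission of the adversarial and task-specific terms are all illustrative assumptions, not the DBACS architecture itself.

```python
# Two mappings between heterogeneous feature spaces, kept mutually coherent
# by a cycle-consistency loss (a simplified stand-in for cyclic supervision).
import torch
import torch.nn as nn

dim_src, dim_tgt = 32, 48                       # heterogeneous feature sizes
F = nn.Sequential(nn.Linear(dim_src, 64), nn.ReLU(), nn.Linear(64, dim_tgt))
G = nn.Sequential(nn.Linear(dim_tgt, 64), nn.ReLU(), nn.Linear(64, dim_src))
l1 = nn.L1Loss()

def cycle_loss(x_src: torch.Tensor, x_tgt: torch.Tensor) -> torch.Tensor:
    # source -> target -> source and target -> source -> target round trips
    return l1(G(F(x_src)), x_src) + l1(F(G(x_tgt)), x_tgt)

x_src, x_tgt = torch.randn(8, dim_src), torch.randn(8, dim_tgt)
loss = cycle_loss(x_src, x_tgt)                 # combined with task/adversarial terms
loss.backward()
```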
An Anomaly Detection (AD) system for self-diagnosis has been developed for the Multiphase Flow Meter (MPFM). The system relies on machine learning algorithms for time series forecasting: historical data are used to train a model that predicts the behavior of a sensor and, thus, detects anomalies.
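A minimal sketch of this forecasting-based detection scheme: fit a model on historical data, forecast each value, and flag an anomaly when the forecast residual is unusually large. The autoregressive least-squares forecaster and the 3-sigma threshold below are illustrative assumptions; the paper's system uses its own forecasting models.

```python
# Forecasting-based anomaly detection: large forecast residuals are anomalies.
import numpy as np

def fit_ar_forecaster(series: np.ndarray, lags: int = 10):
    """Least-squares autoregressive model: predict x[t] from the previous lags."""
    X = np.stack([series[i:i + lags] for i in range(len(series) - lags)])
    y = series[lags:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def detect_anomalies(series: np.ndarray, coef: np.ndarray, lags: int = 10):
    X = np.stack([series[i:i + lags] for i in range(len(series) - lags)])
    residuals = series[lags:] - X @ coef
    threshold = 3 * residuals.std()              # flag > 3-sigma deviations
    return np.where(np.abs(residuals) > threshold)[0] + lags

history = np.sin(np.linspace(0, 20, 500)) + 0.05 * np.random.randn(500)
coef = fit_ar_forecaster(history)
anomalous_indices = detect_anomalies(history, coef)
```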
A universal kernel is constructed whose sections approximate any causal and time-invariant filter in the fading memory category with inputs and outputs in a finite-dimensional Euclidean space. This kernel is built using the reservoir functional associated with a state-space representation of the Volterra series expansion available for any analytic fading memory filter. It is hence called the Volterra reservoir kernel. Even though the state-space representation and the corresponding reservoir feature map are defined on an infinite-dimensional tensor algebra space, the kernel map is characterized by explicit recursions that are readily computable for specific data sets when employed in estimation problems using the representer theorem. We showcase the performance of the Volterra reservoir kernel in a popular data science application in relation to bitcoin price prediction.
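The sketch below shows how any such kernel is deployed through the representer theorem: a kernel ridge regressor whose predictions are linear combinations of kernel evaluations against the training inputs. The Gaussian kernel on input windows is a placeholder for the Volterra reservoir kernel, whose explicit recursions are given in the paper.

```python
# Kernel ridge regression via the representer theorem, with a placeholder
# kernel standing in for the recursively computed Volterra reservoir kernel.
import numpy as np

def window_kernel(A, B, sigma=1.0):
    """Placeholder kernel between input windows; the actual Volterra reservoir
    kernel is computed by the recursions derived in the paper."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

def fit_kernel_ridge(X_train, y_train, reg=1e-3):
    K = window_kernel(X_train, X_train)
    return np.linalg.solve(K + reg * np.eye(len(K)), y_train)

def predict(X_train, alpha, X_test):
    # representer theorem: f(x) = sum_i alpha_i * K(x_i, x)
    return window_kernel(X_test, X_train) @ alpha
```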
Building a quantum analog of classical deep neural networks represents a fundamental challenge in quantum computing. A key issue is how to address the inherent non-linearity of classical deep learning, a problem in the quantum domain due to the fact that the composition of an arbitrary number of quantum gates, consisting of a series of sequential unitary transformations, is intrinsically linear. This problem has been variously approached in the literature, principally via the introduction of measurements between layers of unitary transformations. In this paper, we introduce the Quantum Path Kernel, a formulation of quantum machine learning capable of replicating those aspects of deep machine learning typically associated with superior generalization performance in the classical domain, specifically, hierarchical feature learning. Our approach generalizes the notion of Quantum Neural Tangent Kernel, which has been used to study the dynamics of classical and quantum machine learning models. The Quantum Path Kernel exploits the parameter trajectory, i.e. the curve delineated by model parameters as they evolve during training, enabling the representation of differential layer-wise convergence behaviors, or the formation of hierarchical parametric dependencies, in terms of their manifestation in the gradient space of the predictor function. We evaluate our approach with respect to variants of the classification of Gaussian XOR mixtures - an artificial but emblematic problem that intrinsically requires multilevel learning in order to achieve optimal class separation.
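For orientation, the following hedged LaTeX sketch records the kind of construction the abstract describes: a tangent kernel evaluated at fixed parameters, and a path kernel that accumulates it along the trajectory traced during training. The exact quantum definitions and normalizations are those of the paper, not these formulas.

```latex
% Tangent kernel at fixed parameters \theta (gradient inner product):
K_{\mathrm{NTK}}(x, x'; \theta)
  = \nabla_\theta f(x; \theta) \cdot \nabla_\theta f(x'; \theta)
% Path kernel: the tangent kernel accumulated along the training
% trajectory \theta(t), t \in [0, T] (a sketch, not the paper's definition):
K_{\mathrm{path}}(x, x')
  = \int_0^T K_{\mathrm{NTK}}\big(x, x'; \theta(t)\big)\, \mathrm{d}t
```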